Analysis of SimpleKMeans with Multiple Dimensions using WEKA

Authors

  • Rupali Patil
  • Shyam Deshmukh
  • K Rajeswari
  • Sapna Jain
  • M Afshar
  • Bamshad Mobasher
  • Ritu Sharma
  • M. Afshar Alam
  • Anita Rani
  • Eibe Frank
  • Mark Hall
  • Geoffrey Holmes
  • Richard Kirkby
  • Bernhard Pfahringer
  • Pritam Patil
  • Suvarna Thube
  • Bhakti Ratnaparkhi
Abstract

Clustering techniques are important in data mining, especially when the data set is very large. They are widely used in fields including pattern recognition, machine learning, image analysis, information retrieval and bioinformatics. Several clustering algorithms are available, such as Expectation Maximization (EM), Cobweb, FarthestFirst, OPTICS and SimpleKMeans. SimpleKMeans is a simple clustering algorithm: it partitions n data tuples into k groups such that each tuple belongs to the cluster with the nearest mean. This paper covers the implementation of clustering techniques using the WEKA interface and includes a detailed analysis of various clustering techniques on different standard online data sets. The analysis is based on multiple dimensions: time to build the model, number of attributes, number of iterations, number of clusters and error rate.
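
The partitioning step described in the abstract can be illustrated with a minimal sketch. It is plain Python, not WEKA's SimpleKMeans implementation, and it assumes squared Euclidean distance, random initial centroids and no attribute normalization. The names `k_means` and `squared_distance` are hypothetical and chosen only for this illustration. The function also returns two of the dimensions the abstract lists: the number of iterations and an error measure, here the within-cluster sum of squared errors.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(points, k, max_iterations=100, seed=0):
    """Partition `points` into k clusters by nearest mean (illustrative sketch).

    Returns (centroids, assignments, iterations, sse).
    """
    rng = random.Random(seed)
    # Assumption: initial centroids are k points drawn at random from the data.
    centroids = rng.sample(points, k)
    assignments = None

    for iteration in range(1, max_iterations + 1):
        # Assignment step: each point goes to the cluster with the nearest mean.
        new_assignments = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_assignments == assignments:
            break  # No point changed cluster, so the partition is stable.
        assignments = new_assignments

        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # An empty cluster keeps its previous centroid.
                centroids[c] = tuple(
                    sum(dim) / len(members) for dim in zip(*members)
                )

    # Error measure: within-cluster sum of squared errors.
    sse = sum(squared_distance(p, centroids[a]) for p, a in zip(points, assignments))
    return centroids, assignments, iteration, sse


if __name__ == "__main__":
    data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
            (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
    centroids, assignments, iterations, sse = k_means(data, k=2)
    print(assignments, iterations, sse)
```

Wrapping the call in a timer such as `time.perf_counter()` would give a rough analogue of the "time to build the model" dimension, although the figure would not be comparable to WEKA's own timings.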


Similar resources

El Clustering de Jugadores de Tetris

Behavior analysis in the context of video games has become a very popular practice because it yields vital information about players. In business models such as Free-to-Play (F2P) video games, this information is important for increasing the interest of players/customers in the game, the number of players and the revenue obtained. Likewise...


Identification of the most important factors of ethnic differences in anthropometric dimensions of Iranian workers using the decision tree

Background and aims: Anthropometry is the branch of human science that deals with the physical measurement of the human body, especially its size and shape. One application of anthropometric data in ergonomics is the design of working spaces and the development of industrial products, so that the tools, equipment and workstations, which are designed based on the physical dimensions of the workers, ...


Clustering Student Learning Activity Data

We show a variety of ways to cluster student activity datasets using different clustering and subspace clustering algorithms. Our results suggest that each algorithm has its own strength and weakness, and can be used to find clusters of different properties. 1 Background Introduction Many education datasets are by nature high dimensional. Finding coherent and compact clusters becomes difficult ...


Comparison of Different Classification Techniques Using WEKA for Hematological Data

Abstract: Medical professionals need a reliable prediction methodology to diagnose hematological data comments. There are large quantities of information about patients and their medical conditions. Generally, data mining (sometimes called data or knowledge discovery) is the process of analyzing data from different perspectives and summarizing it into useful information. Data mining software is...


Effect of Aerobic Exercise Program on Quality of Life in Male Patients with Multiple Sclerosis

Introduction: The purpose of this study was to determine the effect of an aerobic exercise program on quality of life in men with multiple sclerosis, as a complementary therapeutic approach to multiple sclerosis. Methods: This was a semi-experimental study. The statistical sample consisted of 60 people selected by the available sampling method from Kahrizak Nursing Home where was also a member of t...




Publication date: 2015